

Search for: All records where Creators/Authors contains "Haizhao Yang"

Note: Clicking on a Digital Object Identifier (DOI) number takes you to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the publisher's embargo period.

Some links on this page may take you to non-federal websites, whose policies may differ from this site's.

  1. Yiming Ying (Ed.)
    Optimization and generalization are two essential aspects of statistical machine learning. In this paper, we propose a framework that connects optimization with generalization by analyzing the generalization error based on the optimization trajectory under the gradient flow algorithm. The key ingredient of this framework is the Uniform-LGI, a property that is generally satisfied when training machine learning models. Leveraging the Uniform-LGI, we first derive convergence rates for the gradient flow algorithm and then give generalization bounds for a large class of machine learning models. We further apply our framework to three distinct machine learning models: linear regression, kernel regression, and two-layer neural networks. Through our approach, we obtain generalization estimates that match or extend previous results. (An illustrative sketch follows this entry.)
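Below is a minimal numerical sketch of the setting described above, assuming the Uniform-LGI takes a Łojasiewicz-type form $\|\nabla L(\theta)\|^2 \ge c\,(L(\theta)-L^*)^{\mu}$ along the gradient-flow trajectory (the paper's exact definition may differ). It discretizes gradient flow with an explicit Euler step on a least-squares problem and monitors the inequality ratio along the trajectory; all names and constants are illustrative.

```python
import numpy as np

# Sketch only: Euler-discretized gradient flow on least-squares linear
# regression, monitoring an assumed Lojasiewicz-type gradient inequality
#   ||grad L(theta)||^2 >= c * (L(theta) - L*)^mu
# along the trajectory. Constants and names are illustrative.

rng = np.random.default_rng(0)
n, d = 50, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def loss(theta):
    r = X @ theta - y
    return 0.5 * np.mean(r ** 2)

def grad(theta):
    return X.T @ (X @ theta - y) / n

theta_star, *_ = np.linalg.lstsq(X, y, rcond=None)   # minimizer of the loss
L_star = loss(theta_star)

theta = np.zeros(d)
dt = 0.01      # Euler step size approximating continuous-time gradient flow
mu = 1.0       # exponent in the assumed inequality (PL case: mu = 1)
ratios = []
for _ in range(5000):
    g = grad(theta)
    gap = loss(theta) - L_star
    if gap > 1e-12:
        ratios.append(g @ g / gap ** mu)   # should stay bounded away from 0
    theta -= dt * g

print(f"final excess loss: {loss(theta) - L_star:.2e}")
print(f"smallest inequality ratio along trajectory: {min(ratios):.3f}")
```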
  2. Ruslan Salakhutdinov (Ed.)
    This paper develops simple feed-forward neural networks that achieve the universal approximation property for all continuous functions with a fixed finite number of neurons. These networks are simple because they use a simple, computable continuous activation function $\sigma$ built from a triangular-wave function and a softsign function. We prove that $\sigma$-activated networks with width $36d(2d+1)$ and depth $11$ can approximate any continuous function on a $d$-dimensional hypercube within an arbitrarily small error. Hence, for supervised learning and its related regression problems, the hypothesis space generated by these networks with a size not smaller than $36d(2d+1)\times 11$ is dense in the space of continuous functions. Furthermore, classification functions arising from image and signal classification lie in the hypothesis space generated by $\sigma$-activated networks with width $36d(2d+1)$ and depth $12$, when there exist pairwise disjoint closed bounded subsets of $\mathbb{R}^d$ such that the samples of the same class are located in the same subset. (An illustrative sketch follows this entry.)
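As a rough illustration of the ingredients named above, the sketch below defines the two building blocks, a triangular-wave function and a softsign function, plus one hypothetical way to combine them into an activation; the paper's actual construction of $\sigma$ and its width-$36d(2d+1)$, depth-$11$ network are not reproduced here.

```python
import numpy as np

# Illustration only: the two building blocks mentioned in the abstract and a
# hypothetical combination. The paper specifies its own sigma; this is not it.

def triangular_wave(x):
    # Period-2 triangle wave: 0 at even integers, 1 at odd integers.
    t = np.mod(x, 2.0)
    return 1.0 - np.abs(t - 1.0)

def softsign(x):
    return x / (1.0 + np.abs(x))

def sigma(x):
    # Hypothetical combination for illustration purposes only.
    return np.where(x < 0.0, softsign(x), triangular_wave(x))

d = 3
width = 36 * d * (2 * d + 1)     # fixed width quoted in the abstract
depth = 11                       # fixed depth quoted in the abstract
print(f"d={d}: width={width}, depth={depth}")
print(sigma(np.linspace(-3.0, 3.0, 7)))
```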
  3. Stefan Harmeling (Ed.)
    Discretization invariant learning aims at learning in infinite-dimensional function spaces with the capacity to process heterogeneous discrete representations of functions as inputs and/or outputs of a learning model. This paper proposes a novel deep learning framework based on integral autoencoders (IAE-Net) for discretization invariant learning. The basic building block of IAE-Net consists of an encoder and a decoder implemented as integral transforms with data-driven kernels, and a fully connected neural network between the encoder and decoder. This basic building block is applied in parallel in a wide multi-channel structure, which is repeatedly composed to form a deep and densely connected neural network with skip connections, yielding IAE-Net. IAE-Net is trained with randomized data augmentation that generates training data with heterogeneous structures to facilitate discretization invariant learning. The proposed IAE-Net is tested on various applications in predictive data science, forward and inverse problems in scientific computing, and signal/image processing. Compared with alternatives in the literature, IAE-Net achieves state-of-the-art performance in existing applications. (An illustrative sketch follows this entry.)
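The sketch below illustrates one plausible reading of the basic IAE-Net building block: an encoder and decoder realized as discretized integral transforms that accept samples of a function on an arbitrary grid, with a small fully connected map in between. The Gaussian kernel, quadrature weights, grids, and weight matrix are stand-ins for the data-driven components described in the abstract, not the authors' implementation.

```python
import numpy as np

# Sketch of one integral-autoencoder-style block: encoder and decoder as
# discretized integral transforms, with a small fully connected latent map.
# The Gaussian kernel and random weights are stand-ins for data-driven parts.

rng = np.random.default_rng(0)

def integral_transform(values, in_grid, out_grid, lengthscale=0.2):
    # (K f)(z_j) ~= sum_i k(z_j, x_i) f(x_i) w_i with trapezoid-like weights.
    w = np.gradient(in_grid)                              # quadrature weights
    K = np.exp(-((out_grid[:, None] - in_grid[None, :]) ** 2)
               / (2.0 * lengthscale ** 2))                # stand-in kernel
    return K @ (values * w)

# A function sampled on an arbitrary (heterogeneous) input grid.
x = np.sort(rng.uniform(0.0, 1.0, size=37))
f = np.sin(2.0 * np.pi * x)

latent_grid = np.linspace(0.0, 1.0, 16)    # fixed latent discretization
out_grid = np.linspace(0.0, 1.0, 64)       # any desired output discretization

z = integral_transform(f, x, latent_grid)           # encoder
W = rng.normal(scale=0.1, size=(16, 16))            # stand-in for trained weights
z = np.tanh(W @ z)                                  # fully connected latent map
g = integral_transform(z, latent_grid, out_grid)    # decoder

print(g.shape)   # output queried on a different grid: discretization invariant
```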